sobolev training
Sobolev Training for Neural Networks
At the heart of deep learning, we aim to use neural networks as function approximators, training them to produce outputs from inputs in emulation of a ground-truth function or data-generating process. In many cases we only have access to input-output pairs from the ground truth; however, it is becoming more common to have access to derivatives of the target output with respect to the input -- for example, when the ground-truth function is itself a neural network, as in network compression or distillation. Generally these target derivatives are either not computed or ignored. This paper introduces Sobolev Training for neural networks, a method for incorporating these target derivatives in addition to the target values while training.
- North America > United States > Maryland > Prince George's County > College Park (0.14)
- Asia > China > Hong Kong (0.04)
- Asia > China > Guangdong Province > Shenzhen (0.04)
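As a concrete illustration of the idea in the abstract above: the training objective augments the usual value-matching loss with a derivative-matching term, with target derivatives obtained by differentiating the teacher. A minimal PyTorch sketch (the names `student`, `teacher`, and the weighting `lam` are illustrative, not from the paper):

```python
import torch

def sobolev_loss(student, teacher, x, lam=1.0):
    """Match the teacher's outputs and input-derivatives (first-order Sobolev loss)."""
    x = x.clone().requires_grad_(True)
    # Target values and target derivatives from the (frozen) teacher
    y_t = teacher(x)
    (dy_t,) = torch.autograd.grad(y_t.sum(), x)
    # Student values and derivatives; create_graph=True so the derivative
    # mismatch can itself be backpropagated to the student's parameters
    y_s = student(x)
    (dy_s,) = torch.autograd.grad(y_s.sum(), x, create_graph=True)
    value_term = ((y_s - y_t.detach()) ** 2).mean()
    deriv_term = ((dy_s - dy_t.detach()) ** 2).sum(dim=-1).mean()
    return value_term + lam * deriv_term
```

Summing the outputs before calling `autograd.grad` yields per-sample input gradients in one pass, since each sample's output depends only on its own input.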
First-order Sobolev Reinforcement Learning
Schramm, Fabian, Perrin-Gilbert, Nicolas, Carpentier, Justin
We propose a refinement of temporal-difference learning that enforces first-order Bellman consistency: the learned value function is trained to match not only the Bellman targets in value but also their derivatives with respect to states and actions. By differentiating the Bellman backup through differentiable dynamics, we obtain analytically consistent gradient targets. Incorporating these into the critic objective using a Sobolev-type loss encourages the critic to align with both the value and local geometry of the target function. This first-order TD matching principle can be seamlessly integrated into existing algorithms, such as Q-learning or actor-critic methods (e.g., DDPG, SAC), potentially leading to faster critic convergence and more stable policy gradients without altering their overall structure.
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
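Assuming differentiable dynamics and rewards as the abstract states, the gradient targets come from differentiating the Bellman backup itself. A hedged PyTorch sketch of such a first-order critic loss; `dynamics`, `reward`, `policy`, and `lam` are illustrative placeholders rather than the authors' code:

```python
import torch

def sobolev_td_loss(critic, target_critic, policy, dynamics, reward, s, a,
                    gamma=0.99, lam=1.0):
    s = s.clone().requires_grad_(True)
    a = a.clone().requires_grad_(True)
    # Bellman target, kept differentiable w.r.t. (s, a) through the dynamics
    s_next = dynamics(s, a)
    y = reward(s, a) + gamma * target_critic(s_next, policy(s_next))
    # Analytically consistent gradient targets dy/ds and dy/da (fixed targets:
    # no create_graph, so they do not require grad)
    dy_s, dy_a = torch.autograd.grad(y.sum(), (s, a))
    # Critic value and its gradients; create_graph=True so the gradient
    # mismatch is trainable w.r.t. the critic parameters
    q = critic(s, a)
    dq_s, dq_a = torch.autograd.grad(q.sum(), (s, a), create_graph=True)
    td = ((q - y.detach()) ** 2).mean()
    grad_match = (((dq_s - dy_s) ** 2).sum(-1).mean()
                  + ((dq_a - dy_a) ** 2).sum(-1).mean())
    return td + lam * grad_match
```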
Precise asymptotic analysis of Sobolev training for random feature models
Fisher, Katharine E, Li, Matthew TC, Marzouk, Youssef, Schorlepp, Timo
Gradient information is useful and widely available in applications, and it is therefore natural to include it in the training of neural networks. Yet little is known theoretically about the impact of Sobolev training -- regression with both function and gradient data -- on the generalization error of highly overparameterized predictive models in high dimensions. In this paper, we obtain a precise characterization of this training modality for random feature (RF) models in the limit where the number of trainable parameters, input dimensions, and training data tend proportionally to infinity. Our model for Sobolev training reflects practical implementations by sketching gradient data onto finite dimensional subspaces. By combining the replica method from statistical physics with linearizations in operator-valued free probability theory, we derive a closed-form description for the generalization errors of the trained RF models. For target functions described by single-index models, we demonstrate that supplementing function data with additional gradient data does not universally improve predictive performance. Rather, the degree of overparameterization should inform the choice of training method. More broadly, our results identify settings where models perform optimally by interpolating noisy function and gradient data.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- North America > United States > Massachusetts > Hampshire County > Amherst (0.13)
- North America > United States > New York > New York County > New York City (0.04)
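Because a random feature model is linear in its trainable weights, Sobolev training on function values plus sketched gradients reduces to a single ridge-regression solve, which makes the setting above easy to instantiate. A NumPy sketch under illustrative choices (tanh features, a single-index teacher, a random sketching matrix `S`); the dimensions and scalings are for demonstration only:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, p, k = 200, 50, 400, 5          # samples, input dim, features, sketch dim

# Single-index teacher y = tanh(<w*, x>/sqrt(d)), with exact gradient data
w_star = rng.standard_normal(d)
X = rng.standard_normal((n, d))
z_star = X @ w_star / np.sqrt(d)
y = np.tanh(z_star)
G_full = (1 - np.tanh(z_star) ** 2)[:, None] * w_star / np.sqrt(d)

S = rng.standard_normal((k, d)) / np.sqrt(d)  # sketch onto a k-dim subspace
G = G_full @ S.T                              # (n, k) sketched gradient targets

# RF model f_a(x) = a^T tanh(W x / sqrt(d)), linear in the weights a
W = rng.standard_normal((p, d))
Z = X @ W.T / np.sqrt(d)
Phi = np.tanh(Z)                              # (n, p) value design matrix
D = 1 - np.tanh(Z) ** 2
# Gradient design per sample i: S W^T diag(D_i) / sqrt(d), shape (k, p)
A_grad = np.einsum('kd,pd,np->nkp', S, W, D) / np.sqrt(d)

lam_g, ridge = 1.0, 1e-3
A = np.vstack([Phi, np.sqrt(lam_g) * A_grad.reshape(n * k, p)])
b = np.concatenate([y, np.sqrt(lam_g) * G.reshape(n * k)])
a_hat = np.linalg.solve(A.T @ A + ridge * np.eye(p), A.T @ b)

print("train value MSE:", np.mean((Phi @ a_hat - y) ** 2))
```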
Sobolev acceleration for neural networks
Oh, Jong Kwon, Lyu, Hanbaek, Son, Hwijae
Sobolev training, which integrates target derivatives into the loss function, has been shown to accelerate convergence and improve generalization compared to conventional $L^2$ training. However, the underlying mechanisms of this training method remain only partially understood. In this work, we present the first rigorous theoretical framework proving that Sobolev training accelerates the convergence of Rectified Linear Unit (ReLU) networks. Under a student-teacher framework with Gaussian inputs and shallow architectures, we derive exact formulas for population gradients and Hessians, and quantify the improvements in conditioning of the loss landscape and gradient-flow convergence rates. Extensive numerical experiments validate our theoretical findings and show that the benefits of Sobolev training extend to modern deep learning tasks.
- Asia > South Korea > Gyeongsangbuk-do > Pohang (0.04)
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- Europe > Italy (0.04)
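The student-teacher setting analyzed above is straightforward to instantiate empirically. A small PyTorch sketch comparing plain $L^2$ training against Sobolev training for shallow ReLU networks with Gaussian inputs (widths, learning rate, and step count are arbitrary illustrative choices):

```python
import torch

torch.manual_seed(0)
d, n = 8, 2048

def make_net(width):
    return torch.nn.Sequential(
        torch.nn.Linear(d, width), torch.nn.ReLU(), torch.nn.Linear(width, 1))

teacher = make_net(16)
x = torch.randn(n, d)  # Gaussian inputs, as in the paper's setting

# Target values and input-gradients from the frozen teacher
x_req = x.clone().requires_grad_(True)
y_t = teacher(x_req)
(dy_t,) = torch.autograd.grad(y_t.sum(), x_req)
y_t, dy_t = y_t.detach(), dy_t.detach()

def train(use_sobolev, steps=500, lr=1e-2):
    torch.manual_seed(1)  # identical student init for a fair comparison
    student = make_net(32)
    opt = torch.optim.Adam(student.parameters(), lr=lr)
    for _ in range(steps):
        xi = x.clone().requires_grad_(True)
        y_s = student(xi)
        (dy_s,) = torch.autograd.grad(y_s.sum(), xi, create_graph=True)
        loss = ((y_s - y_t) ** 2).mean()
        if use_sobolev:
            loss = loss + ((dy_s - dy_t) ** 2).sum(-1).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return ((student(x) - y_t) ** 2).mean().item()

print("value MSE, L2 loss     :", train(False))
print("value MSE, Sobolev loss:", train(True))
```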
Sobolev Training of End-to-End Optimization Proxies
Rosemberg, Andrew W., Garcia, Joaquim Dias, Bent, Russell, Van Hentenryck, Pascal
Optimization proxies - machine learning models trained to approximate the solution mapping of parametric optimization problems in a single forward pass - offer dramatic reductions in inference time compared to traditional iterative solvers. This work investigates the integration of solver sensitivities into such end-to-end proxies via a Sobolev training paradigm, in two distinct settings: (i) fully supervised proxies, where exact solver outputs and sensitivities are available, and (ii) self-supervised proxies that rely only on the objective and constraint structure of the underlying optimization problem. By augmenting the standard training loss with directional-derivative information extracted from the solver, the proxy aligns both its predicted solutions and its local derivatives with those of the optimizer. Under Lipschitz-continuity assumptions on the true solution mapping, matching first-order sensitivities is shown to yield a uniform approximation error proportional to the covering radius of the training set. Empirically, different impacts are observed in each setting. On three large Alternating Current Optimal Power Flow benchmarks, supervised Sobolev training cuts mean squared error by up to 56 percent and the median worst-case constraint violation by up to 400 percent while keeping the optimality gap below 0.22 percent. For a mean-variance portfolio task trained without labeled solutions, self-supervised Sobolev training halves the average optimality gap in the medium-risk region (standard deviation above 10 percent of budget) and matches the baseline elsewhere. Together, these results highlight Sobolev training, whether supervised or self-supervised, as a path to fast, reliable surrogates for safety-critical, large-scale optimization workloads.
- North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
- Europe > France > Bourgogne-Franche-Comté > Doubs > Besançon (0.04)
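Directional-derivative supervision avoids forming full Jacobians: the proxy only needs its Jacobian-vector product along sampled directions, compared against the solver's directional sensitivities. A hedged PyTorch sketch; the solver-provided targets `x_star` and `dx_dir` are assumed given, and all names are illustrative:

```python
import torch

def sobolev_proxy_loss(proxy, theta, x_star, dx_dir, v, lam=0.1):
    """Value + directional-sensitivity matching for an optimization proxy.

    theta:  problem parameters (batch of proxy inputs)
    x_star: solver solutions at theta
    dx_dir: solver directional sensitivities, (d x*/d theta) applied to v
    v:      direction(s) along which the sensitivities were extracted
    """
    # Forward pass and Jacobian-vector product of the proxy along v;
    # create_graph=True keeps the JVP differentiable for training
    x_pred, jvp_pred = torch.autograd.functional.jvp(
        proxy, theta, v, create_graph=True)
    value_loss = ((x_pred - x_star) ** 2).mean()
    sens_loss = ((jvp_pred - dx_dir) ** 2).mean()
    return value_loss + lam * sens_loss
```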
A Hybrid Virtual Element Method and Deep Learning Approach for Solving One-Dimensional Euler-Bernoulli Beams
Enabe, Paulo Akira F., Provasi, Rodrigo
A hybrid framework integrating the Virtual Element Method (VEM) with deep learning is presented as an initial step toward developing efficient and flexible numerical models for one-dimensional Euler-Bernoulli beams. The primary aim is to explore a data-driven surrogate model capable of predicting displacement fields across varying material and geometric parameters while maintaining computational efficiency. Building upon VEM's ability to handle higher-order polynomials and non-conforming discretizations, the method offers a robust numerical foundation for structural mechanics. A neural network architecture is introduced to separately process nodal and material-specific data, effectively capturing complex interactions with minimal reliance on large datasets. To address challenges in training, the model incorporates Sobolev training and GradNorm techniques, ensuring balanced loss contributions and enhanced generalization. While this framework is in its early stages, it demonstrates the potential for further refinement and development into a scalable alternative to traditional methods. The proposed approach lays the groundwork for advancing numerical and data-driven techniques in beam modeling, offering a foundation for future research in structural mechanics.
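The GradNorm balancing mentioned in the abstract can be sketched for the two Sobolev terms (value loss and derivative loss): each term receives a learnable weight, and the weights are updated so the per-term gradient norms at a shared layer stay balanced. A simplified PyTorch sketch of the general technique, not the paper's implementation; `shared_weight` stands for the last shared layer's weight tensor:

```python
import torch

def gradnorm_balance(losses, init_losses, shared_weight, task_weights, alpha=1.5):
    """One GradNorm step for balancing Sobolev loss terms.

    losses:       current per-term losses, e.g. [value_loss, derivative_loss]
    init_losses:  per-term losses recorded at the start of training
    task_weights: learnable positive weights, one per term
    Returns the auxiliary loss whose gradient should update task_weights only.
    """
    # Gradient norm of each weighted term at the shared layer
    gnorms = torch.stack([
        torch.autograd.grad(w * L, shared_weight, retain_graph=True,
                            create_graph=True)[0].norm()
        for w, L in zip(task_weights, losses)
    ])
    # Terms that have made less relative progress get a larger target norm
    inv_rate = torch.stack([(L / L0).detach()
                            for L, L0 in zip(losses, init_losses)])
    target = gnorms.mean().detach() * (inv_rate / inv_rate.mean()) ** alpha
    return (gnorms - target).abs().sum()
```

The main objective is then the weighted sum of the terms; the auxiliary loss above is backpropagated only into `task_weights`, which are typically renormalized after each step so they sum to the number of terms.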